Round 1: Technical Interview (30 Minutes)
📍 Walkthrough of the candidate's past projects, highlighting challenges and solutions.
📍 Questions on the architecture and implementation decisions behind those projects.
📍 Understanding of Databricks architecture and its integration with Azure services.
📍 Performance optimization techniques in Databricks.
📍 Fetching data from Azure Data Lake and storing it in a SQL database.
📍 Write a PySpark script to read data from Azure Data Lake Storage (ADLS) and write it to a SQL database (first sketch after this list).
📍 Explanation of Inner Join, Left Join, Right Join, Full Outer Join, Cross Join, and Semi Join.
📍 Demonstrate how Inner Join and Left Join behave with sample data.
📍 Write a PySpark script to perform joins between two DataFrames (second sketch after this list).
📍 Write a PySpark script to drop specific columns from a DataFrame.
📍 Write a PySpark script to filter specific rows based on conditions (both shown in the third sketch after this list).
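
For reference, a minimal sketch of the ADLS-to-SQL task. Every name in angle brackets is a placeholder, and account-key auth is used only for brevity; a service principal or managed identity is the usual production choice.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("adls-to-sql").getOrCreate()

# Authenticate to ADLS Gen2 with an account key (placeholder values).
spark.conf.set(
    "fs.azure.account.key.<storage_account>.dfs.core.windows.net",
    "<account-key>",
)

# Read source data from the lake over the abfss:// protocol.
df = spark.read.parquet(
    "abfss://<container>@<storage_account>.dfs.core.windows.net/raw/sales/"
)

# Write to Azure SQL Database via JDBC; Databricks runtimes ship with the
# Microsoft SQL Server JDBC driver, so no extra jar is usually needed.
(df.write.format("jdbc")
   .option("url", "jdbc:sqlserver://<server>.database.windows.net:1433;database=<db>")
   .option("dbtable", "dbo.sales")
   .option("user", "<user>")
   .option("password", "<password>")
   .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
   .mode("append")
   .save())
```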
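
A sketch of the join demonstrations on two tiny hand-built DataFrames (the sample data is invented). The same `join()` call handles every join type via the `how` argument.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("joins-demo").getOrCreate()

employees = spark.createDataFrame(
    [(1, "Asha", 10), (2, "Ravi", 20), (3, "Meera", 99)],
    ["emp_id", "name", "dept_id"],
)
departments = spark.createDataFrame(
    [(10, "Sales"), (20, "Finance"), (30, "HR")],
    ["dept_id", "dept_name"],
)

# Inner join: only dept_ids present in BOTH sides (10 and 20) survive.
employees.join(departments, on="dept_id", how="inner").show()

# Left join: every employee is kept; Meera's department columns are null.
employees.join(departments, on="dept_id", how="left").show()

# The other types use the same API: "right", "full", "cross",
# "left_semi" (acts as a filter, keeps only left columns), "left_anti".
employees.join(departments, on="dept_id", how="left_semi").show()
```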
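
And a sketch covering the drop-columns and filter-rows prompts on one illustrative DataFrame:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("drop-filter").getOrCreate()

df = spark.createDataFrame(
    [("A101", "North", 250, "2024-01-05"),
     ("A102", "South", 90, "2024-01-06"),
     ("A103", "North", 400, "2024-01-07")],
    ["order_id", "region", "amount", "order_date"],
)

# Drop specific columns by name; names that don't exist are silently ignored.
trimmed = df.drop("order_date")

# Filter rows on conditions; combine predicates with & / | and parentheses.
trimmed.filter((F.col("region") == "North") & (F.col("amount") > 200)).show()
```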
Round 2: Techno-Managerial Interview (45 Minutes)
📍 Deep dive into the design choices, scalability, and optimizations of the candidate's project.
📍 Discussion on real-world challenges faced and how they were tackled.
📍 Handling failed or late records using Azure Stream Analytics.
📍 Best practices for checkpointing, event retention, and data reprocessing (first sketch after this list).
📍 Top product per location (window function problem).
📍 Write a PySpark script to find the top product for each location using Window Functions (second sketch after this list).
📍 Write a PySpark script to identify duplicate records in a DataFrame (third sketch after this list).
📍 Overview of Spark Architecture (Driver, Executors, Cluster Manager).
📍 Internal Working of Spark (DAG, Stages, Tasks, and Execution Plan); see the fourth sketch after this list.
📍 Types of activities in Azure Data Factory (ADF): Data Flow, Copy, Lookup, Web, ForEach, etc.
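
On late and failed records: in Azure Stream Analytics itself, late-arrival and out-of-order tolerances are job-level "Event ordering" settings rather than code, so there is little to script. The sketch below shows the equivalent ideas in Spark Structured Streaming, the form this discussion usually takes in a Databricks round: a watermark bounds how late events may arrive, and a checkpoint location lets a restarted job resume instead of reprocessing. The rate source and paths are stand-ins.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("late-records").getOrCreate()

# Illustrative source: a rate stream standing in for an Event Hubs feed.
events = (spark.readStream.format("rate")
          .option("rowsPerSecond", 10).load()
          .withColumnRenamed("timestamp", "event_time"))

# Watermark: events more than 10 minutes late are dropped from the aggregate.
counts = (events
          .withWatermark("event_time", "10 minutes")
          .groupBy(F.window("event_time", "5 minutes"))
          .count())

# The checkpoint records stream progress, so a failed job resumes where it
# left off; deleting the checkpoint directory forces a full reprocess.
query = (counts.writeStream
         .outputMode("append")
         .option("checkpointLocation", "/tmp/checkpoints/late-records")
         .format("console")
         .start())
query.awaitTermination()
```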
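
A sketch of the top-product-per-location problem with invented sample data; note the tie-handling choice between dense_rank and row_number:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.appName("top-product").getOrCreate()

sales = spark.createDataFrame(
    [("Mumbai", "Laptop", 120), ("Mumbai", "Phone", 300),
     ("Delhi", "Laptop", 250), ("Delhi", "Tablet", 90),
     ("Delhi", "Phone", 250)],
    ["location", "product", "units_sold"],
)

# Rank products within each location, highest units_sold first.
w = Window.partitionBy("location").orderBy(F.col("units_sold").desc())

# dense_rank keeps ties (Delhi: Laptop and Phone both rank 1); swap in
# row_number() to force exactly one product per location.
(sales.withColumn("rnk", F.dense_rank().over(w))
      .filter(F.col("rnk") == 1)
      .drop("rnk")
      .show())
```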
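
A sketch for flagging duplicates, here defined as a repeated email value (sample data invented):

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("find-dupes").getOrCreate()

df = spark.createDataFrame(
    [(1, "a@x.com"), (2, "b@x.com"), (3, "a@x.com")],
    ["id", "email"],
)

# Group on the columns that define a "duplicate" and keep groups seen twice+.
dupes = (df.groupBy("email")
         .agg(F.count("*").alias("cnt"))
         .filter(F.col("cnt") > 1))
dupes.show()

# Join the keys back to recover the full duplicate rows; to simply remove
# duplicates instead, df.dropDuplicates(["email"]) is the one-liner.
df.join(dupes.select("email"), on="email", how="inner").show()
```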
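
For the DAG/stages/tasks discussion, explain() makes the theory concrete: it prints the parsed, analyzed, optimized, and physical plans, and each Exchange node in the physical plan is a shuffle, i.e. a stage boundary. A minimal demonstration:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("plan-demo").getOrCreate()

df = spark.range(1_000_000).withColumn("bucket", F.col("id") % 10)
agg = df.groupBy("bucket").count()

# Print all four plan variants; look for Exchange (the shuffle) splitting
# the job into two stages.
agg.explain(mode="extended")

# Only an action makes the driver compile the plan into stages of tasks
# that the cluster manager schedules onto executors.
agg.collect()
```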
Round 3: HR Discussion
📍 Salary Negotiation & Expectations
📍 Company Culture & Benefits Discussion
📍 Career Growth & Future Opportunities